Fully Character-Level Neural Machine Translation without Explicit Segmentation
Authors
Abstract
Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT’15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We also observe that the quality of the multilingual character-level translation even surpasses the models trained and tuned on one language pair, namely on CS-EN, FI-EN and RU-EN.
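For illustration, the encoder described in the abstract (character embeddings, a 1-D convolution over characters, max-pooling that shortens the sequence, then a recurrent layer over the pooled segments) can be sketched as below. This is a minimal PyTorch sketch, not the paper's implementation: the class name, layer sizes, kernel width, and pooling stride are all illustrative assumptions.

```python
# A hedged sketch of a character-level convolutional encoder with
# max-pooling over time. All hyperparameters here are assumptions,
# not the paper's exact configuration.
import torch
import torch.nn as nn

class CharConvEncoder(nn.Module):
    def __init__(self, vocab_size=300, emb_dim=128, conv_dim=256,
                 kernel_size=5, pool_stride=5, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Convolution over characters captures local regularities
        # (morphemes, frequent character n-grams).
        self.conv = nn.Conv1d(emb_dim, conv_dim, kernel_size,
                              padding=kernel_size // 2)
        # Max-pooling with stride > 1 shortens the source representation,
        # which is what keeps training speed near subword-level models.
        self.pool = nn.MaxPool1d(pool_stride)
        self.rnn = nn.GRU(conv_dim, hidden_dim, batch_first=True,
                          bidirectional=True)

    def forward(self, char_ids):          # (batch, src_len)
        x = self.embed(char_ids)          # (batch, src_len, emb_dim)
        x = self.conv(x.transpose(1, 2))  # (batch, conv_dim, src_len)
        x = self.pool(torch.relu(x))      # (batch, conv_dim, src_len // stride)
        out, _ = self.rnn(x.transpose(1, 2))
        return out                        # (batch, src_len // stride, 2 * hidden_dim)

enc = CharConvEncoder()
dummy = torch.randint(0, 300, (2, 100))   # two sentences of 100 characters
print(enc(dummy).shape)                   # torch.Size([2, 20, 1024])
```

With a pooling stride of 5, a 100-character source collapses to 20 encoder states, so the attention and decoder operate over a sequence comparable in length to a subword segmentation.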
Similar resources
A Character-level Decoder without Explicit Segmentation for Neural Machine Translation
The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder–decoder with a subword-level enco...
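A character-level decoder of the kind this snippet evaluates can be sketched as a recurrent cell that emits one character per step while attending over the encoder states. The sketch below is illustrative only: the class name, dot-product attention form, and all dimensions are assumptions, not that paper's design.

```python
# A hedged sketch of an attention-based character-level decoder step.
import torch
import torch.nn as nn

class CharAttnDecoder(nn.Module):
    def __init__(self, vocab_size=300, emb_dim=128, hidden_dim=512, enc_dim=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRUCell(emb_dim + enc_dim, hidden_dim)
        self.attn_proj = nn.Linear(hidden_dim, enc_dim)
        self.out = nn.Linear(hidden_dim + enc_dim, vocab_size)

    def step(self, prev_char, h, enc_states):
        # Dot-product attention over the encoder states (an assumption;
        # other attention forms would work the same way here).
        query = self.attn_proj(h).unsqueeze(1)                   # (batch, 1, enc_dim)
        scores = torch.bmm(query, enc_states.transpose(1, 2))    # (batch, 1, src_len)
        context = torch.bmm(torch.softmax(scores, dim=-1),
                            enc_states).squeeze(1)               # (batch, enc_dim)
        h = self.rnn(torch.cat([self.embed(prev_char), context], dim=-1), h)
        return self.out(torch.cat([h, context], dim=-1)), h      # logits over characters

dec = CharAttnDecoder()
h = torch.zeros(2, 512)
enc_states = torch.randn(2, 20, 1024)          # e.g. output of a pooled encoder
logits, h = dec.step(torch.tensor([5, 9]), h, enc_states)
print(logits.shape)                            # torch.Size([2, 300])
```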
NYU-MILA Neural Machine Translation Systems for WMT'16
We describe the neural machine translation system of New York University (NYU) and the University of Montreal (MILA) for the translation tasks of WMT’16. The main goal of the NYU-MILA submission to WMT’16 is to evaluate a new character-level decoding approach in neural machine translation on various language pairs. The proposed neural machine translation system is an attention-based encoder–decoder wit...
Towards Neural Machine Translation with Latent Tree Attention
Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar encoder with a novel attentional RNNG decoder and applying policy gradient reinforcement learning to induce unsupervised tree stru...
A Character-Aware Encoder for Neural Machine Translation
This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences of characters rather than words. Using row convolution (Amodei et al., 2015), the encoder of the proposed model automatically composes word-level information from the input sequences of characters. Since our model doesn’t rely on the boundaries between each wo...
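Row convolution, as referenced there, combines each character position with a small window of future positions through per-offset weight vectors, so word-like units can be composed without explicit boundaries. Below is a minimal sketch under that reading; the window size, dimensions, and class name are assumptions, not the cited paper's exact formulation.

```python
# A hedged sketch of row convolution (after Amodei et al., 2015):
# r_t = sum_j h_{t+j} * w_j over a window of future positions,
# with one elementwise weight vector per offset.
import torch
import torch.nn as nn

class RowConv(nn.Module):
    def __init__(self, dim=128, future_context=4):
        super().__init__()
        # One weight vector per future offset (elementwise, not a full matrix).
        self.weights = nn.Parameter(torch.randn(future_context + 1, dim) * 0.1)
        self.future_context = future_context

    def forward(self, x):                        # (batch, seq_len, dim)
        # Zero-pad on the right so every position sees a full window.
        padded = nn.functional.pad(x, (0, 0, 0, self.future_context))
        out = torch.zeros_like(x)
        for j in range(self.future_context + 1):
            out = out + padded[:, j:j + x.size(1), :] * self.weights[j]
        return out

rc = RowConv()
chars = torch.randn(2, 50, 128)                 # character embeddings
print(rc(chars).shape)                          # torch.Size([2, 50, 128])
```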
Neural Sequence-to-sequence Learning of Internal Word Structure
Learning internal word structure has recently been recognized as an important step in various multilingual processing tasks and in theoretical language comparison. In this paper, we present a neural encoder-decoder model for learning canonical morphological segmentation. Our model combines character-level sequence-to-sequence transformation with a language model over canonical segments. We obta...
Journal title:
- TACL
Volume 5, Issue
Pages -
Publication date 2017